Bayesian Modeling of Dependency Trees Using Hierarchical Pitman-Yor Priors

Author

  • Hanna Wallach
Abstract

Recent work on hierarchical priors for language modeling [MacKay and Peto, 1994, Teh, 2006, Goldwater et al., 2006] has shown significant advantages of Bayesian methods in NLP. The issue of sparse conditioning contexts is ubiquitous in NLP, however, and these smoothing ideas can be applied more broadly to extend the reach of Bayesian modeling in natural language. For example, dependency graphs are a useful representation of higher-level syntactic structure. Specifically, dependency graphs encode relationships between words and their sentence-level syntactic modifiers by representing each sentence in a corpus as a directed graph whose nodes are the part-of-speech-tagged words in that sentence.
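The dependency representation described above can be sketched in a few lines of code. This is an illustration only, not code from the paper: the class name, field names, and the example sentence are all our own assumptions.

```python
from dataclasses import dataclass

# Hypothetical illustration (not from the paper): a dependency graph for one
# sentence. Nodes are (word, POS tag) pairs; heads[i] gives the index of
# token i's syntactic head, with -1 marking the root of the sentence.

@dataclass
class DependencyGraph:
    tokens: list  # list of (word, pos) pairs
    heads: list   # heads[i] = index of token i's head, or -1 for the root

    def children(self, i):
        """Return the indices of tokens whose head is token i."""
        return [j for j, h in enumerate(self.heads) if h == i]

# "the dog barked": 'barked' is the root, 'dog' modifies it, 'the' modifies 'dog'.
sent = DependencyGraph(
    tokens=[("the", "DT"), ("dog", "NN"), ("barked", "VBD")],
    heads=[1, 2, -1],
)
print(sent.children(2))  # tokens headed by 'barked' -> [1]
```

Each sentence thus becomes a small directed graph, and a generative model can condition each word on its head (and that head's context) rather than only on the preceding words.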


Similar resources

Bayesian entropy estimation for countable discrete distributions

We consider the problem of estimating Shannon’s entropy H from discrete data, in cases where the number of possible symbols is unknown or even countably infinite. The Pitman-Yor process, a generalization of the Dirichlet process, provides a tractable prior distribution over the space of countably infinite discrete distributions, and has found major applications in Bayesian non-parametric statistics...
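The Pitman-Yor process mentioned here has a simple sequential construction that can be sketched as a Chinese-restaurant-style sampler. This is a hedged illustration of the standard two-parameter predictive rule, not code from any of the listed papers; the function name and parameter defaults are our own.

```python
import random

def pitman_yor_sample(n_draws, d=0.5, alpha=1.0, seed=0):
    """Draw table assignments from a Pitman-Yor Chinese restaurant process.

    A new customer joins existing table k with probability proportional to
    (n_k - d), and starts a new table with probability proportional to
    (alpha + d * t), where t is the current number of occupied tables,
    0 <= d < 1 is the discount, and alpha > -d is the concentration.
    """
    rng = random.Random(seed)
    counts = []  # counts[k] = number of customers at table k
    for _ in range(n_draws):
        weights = [c - d for c in counts] + [alpha + d * len(counts)]
        r = rng.random() * sum(weights)
        for k, w in enumerate(weights):
            r -= w
            if r < 0:
                break
        if k == len(counts):
            counts.append(1)  # open a new table
        else:
            counts[k] += 1
    return counts

tables = pitman_yor_sample(1000)
print(len(tables), sum(tables))  # number of tables; total customers = 1000
```

With discount d > 0 the number of occupied tables grows as a power law in the number of draws, which is why Pitman-Yor priors suit the heavy-tailed distributions found in language data.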


Bayesian estimation of discrete entropy with mixtures of stick-breaking priors

We consider the problem of estimating Shannon’s entropy H in the under-sampled regime, where the number of possible symbols may be unknown or countably infinite. Dirichlet and Pitman-Yor processes provide tractable prior distributions over the space of countably infinite discrete distributions, and have found major applications in Bayesian non-parametric statistics and machine learning. Here we ...


Bayesian Unsupervised Word Segmentation with Nested Pitman-Yor Language Modeling

In this paper, we propose a new Bayesian model for fully unsupervised word segmentation and an efficient blocked Gibbs sampler combined with dynamic programming for inference. Our model is a nested hierarchical Pitman-Yor language model, in which a Pitman-Yor spelling model is embedded in the word model. We confirmed that it significantly outperforms previously reported results in both phonetic transc...


A parallel training algorithm for hierarchical pitman-yor process language models

The Hierarchical Pitman-Yor Process Language Model (HPYLM) is a Bayesian language model based on a nonparametric prior, the Pitman-Yor process. It has been demonstrated, both theoretically and practically, that the HPYLM can provide better smoothing for language modeling than state-of-the-art approaches such as interpolated Kneser-Ney and modified Kneser-Ney smoothing. However, estimat...


A Hierarchical Nonparametric Bayesian Approach to Statistical Language Model Domain Adaptation

In this paper we present a doubly hierarchical Pitman-Yor process language model. Its bottom layer of hierarchy consists of multiple hierarchical Pitman-Yor process language models, one for each of a number of domains. The novel top layer of hierarchy consists of a mechanism to couple together multiple language models such that they share statistical strength. Intuitively this sharing results i...



Journal:

Volume   Issue

Pages  -

Publication date: 2008